Overview

Dataset statistics

Number of variables: 11
Number of observations: 1000
Missing cells: 67
Missing cells (%): 0.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 126.0 KiB
Average record size in memory: 129.1 B
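
The missing-cell percentage can be checked by hand from the counts above. This is a minimal sketch using only the reported figures; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" block above.
n_variables = 11
n_observations = 1000
missing_cells = 67

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```

All 67 missing cells belong to the Age variable, so the per-variable rate (6.7%) is much higher than the dataset-wide rate.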

Variable types

Categorical: 7
Numeric: 4

Alerts

Age is highly overall correlated with id_student (High correlation)
Year is highly overall correlated with id_student (High correlation)
gender is highly overall correlated with id_student (High correlation)
id_student is highly overall correlated with Age and 6 other fields (High correlation)
lunch is highly overall correlated with id_student (High correlation)
math score is highly overall correlated with reading score and 1 other field (High correlation)
parental level of education is highly overall correlated with id_student (High correlation)
race/ethnicity is highly overall correlated with id_student (High correlation)
reading score is highly overall correlated with math score and 1 other field (High correlation)
test preparation course is highly overall correlated with id_student (High correlation)
writing score is highly overall correlated with math score and 1 other field (High correlation)
Year is highly imbalanced (68.1%) (Imbalance)
Age has 67 (6.7%) missing values (Missing)
id_student is uniformly distributed (Uniform)
id_student has unique values (Unique)
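
Most of the high-correlation alerts involve id_student, which takes a different value on every row. One plausible reading, which the report itself does not confirm, is that an association measure such as Cramér's V is being applied to the identifier: uncorrected Cramér's V between a column of unique values and any categorical column is exactly 1, whatever the data. The sketch below demonstrates this with the gender and lunch values from the first ten rows of the report's sample; the `cramers_v` helper is written here for illustration and is not taken from ydata-profiling.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long sequences of labels."""
    n = len(xs)
    x_counts = Counter(xs)
    y_counts = Counter(ys)
    joint = Counter(zip(xs, ys))
    chi2 = 0.0
    for x, nx in x_counts.items():
        for y, ny in y_counts.items():
            expected = nx * ny / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(x_counts), len(y_counts)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


# First ten rows of the sample table: id_student, gender, lunch.
ids = list(range(1000, 1010))
gender = ["MALE", "FEMALE", "MALE", "MALE", "MALE",
          "FEMALE", "FEMALE", "MALE", "MALE", "MALE"]
lunch = ["standard", "free/reduced", "free/reduced", "standard", "standard",
         "standard", "standard", "standard", "standard", "free/reduced"]

v_id = cramers_v(ids, gender)     # unique identifier vs. category
v_cat = cramers_v(gender, lunch)  # category vs. category

print(f"id vs gender: {v_id:.3f}, gender vs lunch: {v_cat:.3f}")
```

If that reading is right, the id_student alerts carry no information about the students, and the identifier would normally be excluded before interpreting correlations.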

Reproduction

Analysis started: 2025-03-16 21:58:33.160409
Analysis finished: 2025-03-16 21:58:46.292084
Duration: 13.13 seconds
Software version: ydata-profiling v4.14.0
Download configuration: config.json

Variables

gender
Categorical

High correlation 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 47.9 KiB

MALE: 517
FEMALE: 483

Length

Max length: 6
Median length: 4
Mean length: 4.966
Min length: 4
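
The length statistics follow directly from the category counts. A small sketch, assuming only the two gender values and their counts as reported in this section:

```python
# Category counts as reported for the gender variable.
counts = {"MALE": 517, "FEMALE": 483}

total_rows = sum(counts.values())
# Each row contributes len(value) characters.
total_chars = sum(len(value) * count for value, count in counts.items())
mean_length = total_chars / total_rows

print(total_chars, mean_length)
```

The same calculation gives the "Total characters" figure reported under "Characters and Unicode".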

Characters and Unicode

Total characters: 4966
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: MALE
2nd row: FEMALE
3rd row: MALE
4th row: MALE
5th row: MALE

Common Values

Value | Count | Frequency (%)
MALE | 517 | 51.7%
FEMALE | 483 | 48.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Words

Value | Count | Frequency (%)
male | 517 | 51.7%
female | 483 | 48.3%

Most occurring characters

Value | Count | Frequency (%)
E | 1483 | 29.9%
M | 1000 | 20.1%
A | 1000 | 20.1%
L | 1000 | 20.1%
F | 483 | 9.7%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 4966 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 4966 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 4966 | 100.0%

Most frequent character per category, per script, and per block

All characters fall under the single (unknown) category, script, and block, so each of these three tables repeats the "Most occurring characters" table above.

race/ethnicity
Categorical

High correlation 

Distinct: 5
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 47.9 KiB

group C: 323
group D: 262
group B: 205
group E: 131
group A: 79

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 7000
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: group A
2nd row: group D
3rd row: group E
4th row: group B
5th row: group E

Common Values

Value | Count | Frequency (%)
group C | 323 | 32.3%
group D | 262 | 26.2%
group B | 205 | 20.5%
group E | 131 | 13.1%
group A | 79 | 7.9%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Words

Value | Count | Frequency (%)
group | 1000 | 50.0%
c | 323 | 16.2%
d | 262 | 13.1%
b | 205 | 10.2%
e | 131 | 6.6%
a | 79 | 4.0%

Most occurring characters

Value | Count | Frequency (%)
g | 1000 | 14.3%
r | 1000 | 14.3%
o | 1000 | 14.3%
u | 1000 | 14.3%
p | 1000 | 14.3%
(space) | 1000 | 14.3%
C | 323 | 4.6%
D | 262 | 3.7%
B | 205 | 2.9%
E | 131 | 1.9%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 7000 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 7000 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 7000 | 100.0%

Most frequent character per category, per script, and per block

All characters fall under the single (unknown) category, script, and block, so each of these three tables repeats the "Most occurring characters" table above.

parental level of education
Categorical

High correlation 

Distinct: 6
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 47.9 KiB

some college: 222
associate's degree: 203
high school: 202
some high school: 191
bachelor's degree: 112

Length

Max length: 18
Median length: 16
Mean length: 14.55
Min length: 11

Characters and Unicode

Total characters: 14550
Distinct characters: 16
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: high school
2nd row: some high school
3rd row: some college
4th row: high school
5th row: associate's degree

Common Values

Value | Count | Frequency (%)
some college | 222 | 22.2%
associate's degree | 203 | 20.3%
high school | 202 | 20.2%
some high school | 191 | 19.1%
bachelor's degree | 112 | 11.2%
master's degree | 70 | 7.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Words

Value | Count | Frequency (%)
some | 413 | 18.8%
high | 393 | 17.9%
school | 393 | 17.9%
degree | 385 | 17.6%
college | 222 | 10.1%
associate's | 203 | 9.3%
bachelor's | 112 | 5.1%
master's | 70 | 3.2%

Most occurring characters

Value | Count | Frequency (%)
e | 2397 | 16.5%
o | 1736 | 11.9%
s | 1667 | 11.5%
h | 1291 | 8.9%
(space) | 1191 | 8.2%
g | 1000 | 6.9%
l | 949 | 6.5%
c | 930 | 6.4%
i | 596 | 4.1%
a | 588 | 4.0%
Other values (6) | 2205 | 15.2%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 14550 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 14550 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 14550 | 100.0%

Most frequent character per category, per script, and per block

All characters fall under the single (unknown) category, script, and block, so each of these three tables repeats the "Most occurring characters" table above.

lunch
Categorical

High correlation 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 47.9 KiB

standard: 652
free/reduced: 348

Length

Max length: 12
Median length: 8
Mean length: 9.392
Min length: 8

Characters and Unicode

Total characters: 9392
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: standard
2nd row: free/reduced
3rd row: free/reduced
4th row: standard
5th row: standard

Common Values

Value | Count | Frequency (%)
standard | 652 | 65.2%
free/reduced | 348 | 34.8%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Words

Value | Count | Frequency (%)
standard | 652 | 65.2%
free/reduced | 348 | 34.8%

Most occurring characters

Value | Count | Frequency (%)
d | 2000 | 21.3%
e | 1392 | 14.8%
r | 1348 | 14.4%
a | 1304 | 13.9%
s | 652 | 6.9%
t | 652 | 6.9%
n | 652 | 6.9%
f | 348 | 3.7%
/ | 348 | 3.7%
u | 348 | 3.7%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 9392 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 9392 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 9392 | 100.0%

Most frequent character per category, per script, and per block

All characters fall under the single (unknown) category, script, and block, so each of these three tables repeats the "Most occurring characters" table above.

test preparation course
Categorical

High correlation 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 47.9 KiB

none: 665
completed: 335

Length

Max length: 9
Median length: 4
Mean length: 5.675
Min length: 4

Characters and Unicode

Total characters: 5675
Distinct characters: 9
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: completed
2nd row: none
3rd row: none
4th row: none
5th row: completed

Common Values

Value | Count | Frequency (%)
none | 665 | 66.5%
completed | 335 | 33.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Words

Value | Count | Frequency (%)
none | 665 | 66.5%
completed | 335 | 33.5%

Most occurring characters

Value | Count | Frequency (%)
e | 1335 | 23.5%
n | 1330 | 23.4%
o | 1000 | 17.6%
c | 335 | 5.9%
m | 335 | 5.9%
p | 335 | 5.9%
l | 335 | 5.9%
t | 335 | 5.9%
d | 335 | 5.9%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 5675 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 5675 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 5675 | 100.0%

Most frequent character per category, per script, and per block

All characters fall under the single (unknown) category, script, and block, so each of these three tables repeats the "Most occurring characters" table above.

math score
Real number (ℝ)

High correlation 

Distinct: 78
Distinct (%): 7.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 66.436
Minimum: 13
Maximum: 120
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 47.9 KiB

Quantile statistics

Minimum: 13
5-th percentile: 40
Q1: 56
Median: 66.5
Q3: 77
95-th percentile: 91
Maximum: 120
Range: 107
Interquartile range (IQR): 21
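
The quartiles above are enough to run a quick outlier check. The sketch below applies Tukey's 1.5 × IQR fences, a common convention that the report itself does not use or mention, to the reported minimum and maximum.

```python
# Quantile statistics reported for math score.
q1, q3 = 56, 77
minimum, maximum = 13, 120

iqr = q3 - q1
# Tukey fences: the 1.5 multiplier is a convention, not a report setting.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

flags = {
    "minimum below lower fence": minimum < lower_fence,
    "maximum above upper fence": maximum > upper_fence,
}
print(lower_fence, upper_fence, flags)
```

Both extremes fall outside the fences. The maximum of 120 stands out in particular, because the reading and writing scores both top out at 100.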

Descriptive statistics

Standard deviation: 15.489927
Coefficient of variation (CV): 0.23315563
Kurtosis: -0.14193308
Mean: 66.436
Median Absolute Deviation (MAD): 10.5
Skewness: -0.11548843
Sum: 66436
Variance: 239.93784
Monotonicity: Not monotonic
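
Several of these descriptive statistics are derived from one another. As a small consistency check using only the reported mean and standard deviation, the coefficient of variation is the standard deviation divided by the mean, and the variance is the standard deviation squared:

```python
# Reported descriptive statistics for math score.
mean = 66.436
std = 15.489927

cv = std / mean       # coefficient of variation
variance = std ** 2   # variance from the standard deviation

print(f"CV = {cv:.8f}, variance = {variance:.5f}")
```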
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
63 | 34 | 3.4%
71 | 30 | 3.0%
77 | 30 | 3.0%
74 | 28 | 2.8%
57 | 27 | 2.7%
66 | 26 | 2.6%
58 | 26 | 2.6%
65 | 25 | 2.5%
70 | 25 | 2.5%
78 | 24 | 2.4%
Other values (68) | 725 | 72.5%

Minimum 10 values

Value | Count | Frequency (%)
13 | 2 | 0.2%
23 | 1 | 0.1%
25 | 1 | 0.1%
26 | 2 | 0.2%
28 | 2 | 0.2%
29 | 1 | 0.1%
30 | 2 | 0.2%
31 | 2 | 0.2%
32 | 2 | 0.2%
33 | 4 | 0.4%

Maximum 10 values

Value | Count | Frequency (%)
120 | 1 | 0.1%
100 | 14 | 1.4%
99 | 3 | 0.3%
98 | 3 | 0.3%
97 | 3 | 0.3%
96 | 3 | 0.3%
95 | 2 | 0.2%
94 | 7 | 0.7%
93 | 5 | 0.5%
92 | 7 | 0.7%

reading score
Real number (ℝ)

High correlation 

Distinct: 85
Distinct (%): 8.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 64.951
Minimum: 15
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 47.9 KiB

Quantile statistics

Minimum: 15
5-th percentile: 29
Q1: 54
Median: 68
Q3: 78
95-th percentile: 93
Maximum: 100
Range: 85
Interquartile range (IQR): 24

Descriptive statistics

Standard deviation: 19.010049
Coefficient of variation (CV): 0.29268294
Kurtosis: -0.41685377
Mean: 64.951
Median Absolute Deviation (MAD): 12
Skewness: -0.47058794
Sum: 64951
Variance: 361.38198
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
63 | 32 | 3.2%
71 | 29 | 2.9%
73 | 29 | 2.9%
64 | 28 | 2.8%
78 | 27 | 2.7%
76 | 25 | 2.5%
72 | 23 | 2.3%
70 | 23 | 2.3%
87 | 23 | 2.3%
62 | 23 | 2.3%
Other values (75) | 738 | 73.8%

Minimum 10 values

Value | Count | Frequency (%)
15 | 1 | 0.1%
17 | 3 | 0.3%
18 | 1 | 0.1%
19 | 5 | 0.5%
20 | 1 | 0.1%
21 | 1 | 0.1%
22 | 1 | 0.1%
23 | 4 | 0.4%
24 | 6 | 0.6%
25 | 7 | 0.7%

Maximum 10 values

Value | Count | Frequency (%)
100 | 19 | 1.9%
99 | 2 | 0.2%
98 | 2 | 0.2%
97 | 3 | 0.3%
96 | 4 | 0.4%
95 | 9 | 0.9%
94 | 4 | 0.4%
93 | 8 | 0.8%
92 | 7 | 0.7%
91 | 10 | 1.0%

writing score
Real number (ℝ)

High correlation 

Distinct: 57
Distinct (%): 5.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 70.339
Minimum: 23
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 47.9 KiB

Quantile statistics

Minimum: 23
5-th percentile: 42
Q1: 58
Median: 68
Q3: 79
95-th percentile: 100
Maximum: 100
Range: 77
Interquartile range (IQR): 21

Descriptive statistics

Standard deviation: 19.160094
Coefficient of variation (CV): 0.27239645
Kurtosis: -0.72643607
Mean: 70.339
Median Absolute Deviation (MAD): 11
Skewness: 0.22996218
Sum: 70339
Variance: 367.10919
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
100 | 225 | 22.5%
64 | 34 | 3.4%
71 | 34 | 3.4%
66 | 30 | 3.0%
70 | 29 | 2.9%
65 | 29 | 2.9%
73 | 29 | 2.9%
60 | 27 | 2.7%
68 | 25 | 2.5%
63 | 24 | 2.4%
Other values (47) | 514 | 51.4%

Minimum 10 values

Value | Count | Frequency (%)
23 | 2 | 0.2%
24 | 1 | 0.1%
26 | 1 | 0.1%
27 | 2 | 0.2%
28 | 2 | 0.2%
30 | 1 | 0.1%
31 | 2 | 0.2%
32 | 3 | 0.3%
33 | 4 | 0.4%
34 | 1 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
100 | 225 | 22.5%
80 | 17 | 1.7%
79 | 17 | 1.7%
78 | 18 | 1.8%
77 | 17 | 1.7%
76 | 22 | 2.2%
75 | 20 | 2.0%
74 | 17 | 1.7%
73 | 29 | 2.9%
72 | 23 | 2.3%

id_student
Real number (ℝ)

High correlation  Uniform  Unique 

Distinct: 1000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1499.5
Minimum: 1000
Maximum: 1999
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 47.9 KiB

Quantile statistics

Minimum: 1000
5-th percentile: 1049.95
Q1: 1249.75
Median: 1499.5
Q3: 1749.25
95-th percentile: 1949.05
Maximum: 1999
Range: 999
Interquartile range (IQR): 499.5

Descriptive statistics

Standard deviation: 288.81944
Coefficient of variation (CV): 0.19261049
Kurtosis: -1.2
Mean: 1499.5
Median Absolute Deviation (MAD): 250
Skewness: 0
Sum: 1499500
Variance: 83416.667
Monotonicity: Strictly increasing
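
These statistics are what one would expect from a plain run of consecutive integers rather than from a measured quantity. The sketch below assumes id_student is exactly the integers 1000 through 1999 in row order; that assumption is consistent with the 1000 distinct values, the minimum, the maximum, and the strictly increasing order, but the report does not state it outright.

```python
import statistics

# Assumption: id_student is the consecutive integers 1000..1999 in row order.
ids = list(range(1000, 2000))

mean = statistics.mean(ids)
variance = statistics.variance(ids)  # sample variance (n - 1 denominator)
stdev = statistics.stdev(ids)
strictly_increasing = all(a < b for a, b in zip(ids, ids[1:]))

print(mean, round(variance, 3), round(stdev, 5), strictly_increasing)
```

The mean, variance, and standard deviation reproduce the reported values, which supports treating this column as a row identifier rather than as a feature.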
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
1000 | 1 | 0.1%
1671 | 1 | 0.1%
1658 | 1 | 0.1%
1659 | 1 | 0.1%
1660 | 1 | 0.1%
1661 | 1 | 0.1%
1662 | 1 | 0.1%
1663 | 1 | 0.1%
1664 | 1 | 0.1%
1665 | 1 | 0.1%
Other values (990) | 990 | 99.0%

Minimum 10 values

Value | Count | Frequency (%)
1000 | 1 | 0.1%
1001 | 1 | 0.1%
1002 | 1 | 0.1%
1003 | 1 | 0.1%
1004 | 1 | 0.1%
1005 | 1 | 0.1%
1006 | 1 | 0.1%
1007 | 1 | 0.1%
1008 | 1 | 0.1%
1009 | 1 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
1999 | 1 | 0.1%
1998 | 1 | 0.1%
1997 | 1 | 0.1%
1996 | 1 | 0.1%
1995 | 1 | 0.1%
1994 | 1 | 0.1%
1993 | 1 | 0.1%
1992 | 1 | 0.1%
1991 | 1 | 0.1%
1990 | 1 | 0.1%

Year
Categorical

High correlation  Imbalance 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 47.9 KiB

2023: 942
1990: 58

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 4000
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023
2nd row: 2023
3rd row: 2023
4th row: 2023
5th row: 2023

Common Values

Value | Count | Frequency (%)
2023 | 942 | 94.2%
1990 | 58 | 5.8%
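
The 68.1% figure in the Year imbalance alert can be reproduced from these two counts. The sketch below assumes the score is one minus the Shannon entropy of the class frequencies, normalised by the maximum possible entropy for that number of classes. That formula matches the reported value, but it is inferred here rather than quoted from the report.

```python
from math import log2

# Class counts reported for Year.
counts = {"2023": 942, "1990": 58}
total = sum(counts.values())

# Shannon entropy of the class distribution, in bits.
entropy = -sum((c / total) * log2(c / total) for c in counts.values())

# Assumed definition: 0 for perfectly balanced classes, 1 for a single class.
imbalance = 1 - entropy / log2(len(counts))

print(f"imbalance = {100 * imbalance:.1f}%")
```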

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Words

Value | Count | Frequency (%)
2023 | 942 | 94.2%
1990 | 58 | 5.8%

Most occurring characters

Value | Count | Frequency (%)
2 | 1884 | 47.1%
0 | 1000 | 25.0%
3 | 942 | 23.5%
9 | 116 | 2.9%
1 | 58 | 1.5%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 4000 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 4000 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 4000 | 100.0%

Most frequent character per category, per script, and per block

All characters fall under the single (unknown) category, script, and block, so each of these three tables repeats the "Most occurring characters" table above.

Age
Categorical

High correlation  Missing 

Distinct: 4
Distinct (%): 0.4%
Missing: 67
Missing (%): 6.7%
Memory size: 47.9 KiB

14.0: 259
17.0: 242
16.0: 225
15.0: 207

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 3732
Distinct characters: 7
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 14.0
2nd row: 17.0
3rd row: 14.0
4th row: 17.0
5th row: 16.0

Common Values

Value | Count | Frequency (%)
14.0 | 259 | 25.9%
17.0 | 242 | 24.2%
16.0 | 225 | 22.5%
15.0 | 207 | 20.7%
(Missing) | 67 | 6.7%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Words

Value | Count | Frequency (%)
14.0 | 259 | 27.8%
17.0 | 242 | 25.9%
16.0 | 225 | 24.1%
15.0 | 207 | 22.2%
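
The two Age frequency tables report different percentages for the same counts (25.9% against 27.8% for the value 14.0, for example). The difference is explained by the denominator: the first table divides by all 1000 rows, while the second appears to divide by the 933 non-missing rows only. A short sketch reproducing both sets of figures from the reported counts:

```python
# Reported Age counts; the 67 missing rows are not listed here.
counts = {"14.0": 259, "17.0": 242, "16.0": 225, "15.0": 207}
n_rows = 1000
n_present = sum(counts.values())

# Share of all rows (missing values included in the denominator).
share_of_all = {k: round(100 * c / n_rows, 1) for k, c in counts.items()}
# Share of non-missing rows only.
share_of_present = {k: round(100 * c / n_present, 1) for k, c in counts.items()}

print(n_present, share_of_all, share_of_present)
```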

Most occurring characters

Value | Count | Frequency (%)
1 | 933 | 25.0%
. | 933 | 25.0%
0 | 933 | 25.0%
4 | 259 | 6.9%
7 | 242 | 6.5%
6 | 225 | 6.0%
5 | 207 | 5.5%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 3732 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 3732 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 3732 | 100.0%

Most frequent character per category, per script, and per block

All characters fall under the single (unknown) category, script, and block, so each of these three tables repeats the "Most occurring characters" table above.

Interactions

[Figures: 16 interaction plots; images not recoverable from this extraction]

Correlations

[Figure: correlation heatmap]

Variable | Age | Year | gender | id_student | lunch | math score | parental level of education | race/ethnicity | reading score | test preparation course | writing score
Age | 1.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.047 | 0.061 | 0.049 | 0.038 | 0.029 | 0.000
Year | 0.000 | 1.000 | 0.035 | 1.000 | 0.000 | 0.000 | 0.013 | 0.000 | 0.024 | 0.033 | 0.056
gender | 0.000 | 0.035 | 1.000 | 1.000 | 0.004 | 0.207 | 0.107 | 0.066 | 0.172 | 0.000 | 0.225
id_student | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.026 | 1.000 | 1.000 | 0.021 | 1.000 | 0.011
lunch | 0.000 | 0.000 | 0.004 | 1.000 | 1.000 | 0.361 | 0.000 | 0.072 | 0.266 | 0.000 | 0.309
math score | 0.047 | 0.000 | 0.207 | 0.026 | 0.361 | 1.000 | 0.100 | 0.133 | 0.720 | 0.152 | 0.792
parental level of education | 0.061 | 0.013 | 0.107 | 1.000 | 0.000 | 0.100 | 1.000 | 0.000 | 0.309 | 0.034 | 0.119
race/ethnicity | 0.049 | 0.000 | 0.066 | 1.000 | 0.072 | 0.133 | 0.000 | 1.000 | 0.073 | 0.000 | 0.102
reading score | 0.038 | 0.024 | 0.172 | 0.021 | 0.266 | 0.720 | 0.309 | 0.073 | 1.000 | 0.331 | 0.841
test preparation course | 0.029 | 0.033 | 0.000 | 1.000 | 0.000 | 0.152 | 0.034 | 0.000 | 0.331 | 1.000 | 0.295
writing score | 0.000 | 0.056 | 0.225 | 0.011 | 0.309 | 0.792 | 0.119 | 0.102 | 0.841 | 0.295 | 1.000
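
To read the matrix without the identifier column, it helps to list only the pairs above a cut-off. The sketch below uses a handful of coefficients copied from the matrix and a threshold of 0.5; the threshold is an assumption chosen for illustration and may not match the setting that produced the report's alerts.

```python
# Selected coefficients copied from the correlation matrix (id_student excluded).
correlations = {
    ("math score", "reading score"): 0.720,
    ("math score", "writing score"): 0.792,
    ("reading score", "writing score"): 0.841,
    ("lunch", "math score"): 0.361,
    ("reading score", "test preparation course"): 0.331,
    ("gender", "writing score"): 0.225,
}

THRESHOLD = 0.5  # assumed cut-off, not taken from the report

# Keep pairs at or above the threshold, strongest first.
high_pairs = sorted(
    (pair for pair, value in correlations.items() if value >= THRESHOLD),
    key=lambda pair: correlations[pair],
    reverse=True,
)

for a, b in high_pairs:
    print(f"{a} ~ {b}: {correlations[(a, b)]:.3f}")
```

With this cut-off only the three test-score pairs remain, which agrees with the alerts that link math, reading, and writing scores to one another.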

Missing values

[Figure: nullity count by column] A simple visualization of nullity by column.
[Figure: nullity matrix] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

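First rows

Index | gender | race/ethnicity | parental level of education | lunch | test preparation course | math score | reading score | writing score | id_student | Year | Age
0 | MALE | group A | high school | standard | completed | 67 | 67 | 63 | 1000 | 2023 | 14.0
1 | FEMALE | group D | some high school | free/reduced | none | 40 | 29 | 55 | 1001 | 2023 | 17.0
2 | MALE | group E | some college | free/reduced | none | 59 | 60 | 50 | 1002 | 2023 | 14.0
3 | MALE | group B | high school | standard | none | 77 | 78 | 68 | 1003 | 2023 | 17.0
4 | MALE | group E | associate's degree | standard | completed | 78 | 73 | 68 | 1004 | 2023 | 16.0
5 | FEMALE | group D | high school | standard | none | 63 | 77 | 76 | 1005 | 2023 | 16.0
6 | FEMALE | group A | bachelor's degree | standard | none | 62 | 59 | 63 | 1006 | 2023 | 14.0
7 | MALE | group E | some college | standard | completed | 93 | 88 | 100 | 1007 | 2023 | 17.0
8 | MALE | group D | high school | standard | none | 63 | 56 | 65 | 1008 | 2023 | 15.0
9 | MALE | group C | some college | free/reduced | none | 47 | 42 | 45 | 1009 | 2023 | 16.0

Last rows

Index | gender | race/ethnicity | parental level of education | lunch | test preparation course | math score | reading score | writing score | id_student | Year | Age
990 | MALE | group D | some college | standard | none | 67 | 55 | 53 | 1990 | 2023 | 16.0
991 | FEMALE | group C | associate's degree | standard | none | 87 | 93 | 100 | 1991 | 2023 | 17.0
992 | MALE | group C | some college | standard | none | 69 | 63 | 66 | 1992 | 2023 | 15.0
993 | FEMALE | group A | associate's degree | standard | none | 58 | 54 | 58 | 1993 | 2023 | 15.0
994 | MALE | group E | high school | free/reduced | completed | 86 | 82 | 75 | 1994 | 2023 | 16.0
995 | MALE | group C | high school | standard | none | 73 | 70 | 65 | 1995 | 1990 | 15.0
996 | MALE | group D | associate's degree | free/reduced | completed | 85 | 91 | 100 | 1996 | 2023 | 14.0
997 | FEMALE | group C | some high school | free/reduced | none | 32 | 17 | 41 | 1997 | 2023 | 17.0
998 | FEMALE | group C | some college | standard | none | 73 | 74 | 100 | 1998 | 2023 | 17.0
999 | MALE | group A | some college | standard | completed | 65 | 60 | 62 | 1999 | 2023 | 15.0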